Extraction and Semantic Annotation of Workshop Proceedings in HTML Using RML

نویسندگان

  • Anastasia Dimou
  • Miel Vander Sande
  • Pieter Colpaert
  • Laurens De Vocht
  • Ruben Verborgh
  • Erik Mannens
  • Rik Van de Walle
چکیده

Despite the significant number of existing tools, incorporating data into the Linked Open Data cloud remains complicated; hence discouraging data owners to publish their data as Linked Data. Unlocking the semantics of published data, even if they are not provided by the data owners, can contribute to surpass the barriers posed by the low availability of Linked Data and come closer to the realisation of the envisaged Semantic Web. rml, a generic mapping language based on an extension over rrml, the wc standard for mapping relational databases into rdf, offers a uniform way of defining the mapping rules for data in heterogeneous formats. In this paper, we present how we adjusted our prototype rml Processor, taking advantage of rml’s scalability, to extract and map data of workshop proceedings published in html to the rdf data model for the Semantic Publishing Challenge needs.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Information Extraction from Unstructured and Ungrammatical Data Sources for Semantic Annotation

The internet has become an attractive avenue for global e-business, e-learning, knowledge sharing, etc. Due to continuous increase in the volume of web content, it is not practically possible for a user to extract information by browsing and integrating data from a huge amount of web sources retrieved by the existing search engines. The semantic web technology enables advancement in information...

متن کامل

Semantic Enhancement Engine: A Modular Document Enhancement Platform for Semantic Applications over Heterogeneous Content

Traditionally, automatic classification and metadata extraction have been performed in isolation, usually on unformatted text. SCORE Enhancement Engine (SEE) is a component of a Semantic Web technology called the Semantic Content Organization and Retrieval Engine (SCORE). SEE takes the next natural steps by supporting heterogeneous content (not only unformatted text), as well as following up au...

متن کامل

Building Semantic Intranets: What is Needed in the Annotation Toolbox?

While much of a company’s explicit knowledge can be found in text repositories, current content management systems only provide limited capabilities for structuring and making sense of documents. In the emerging Semantic Web, however, some of the traditional document search, interpretation and aggregation problems can be addressed by ontology-based semantic markup, which enables intelligent sea...

متن کامل

Fuzzy Neighbor Voting for Automatic Image Annotation

With quick development of digital images and the availability of imaging tools, massive amounts of images are created. Therefore, efficient management and suitable retrieval, especially by computers, is one of themost challenging fields in image processing. Automatic image annotation (AIA) or refers to attaching words, keywords or comments to an image or to a selected part of it. In this paper,...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014